Parsing File Names


Starting with Version 2.0, MS-DOS has supported file names that may include a drive name and a
sub-directory path as well as the basic file name and extension. DOS accepts such names in system calls
to open, delete or rename files, and so do recent versions of the Lattice and Microsoft C library
functions. Most of the time, C programs can take file names from the command line or from console
input and pass them to DOS without analysis. But sometimes it is necessary to pick a file name apart
into its parts; the parse_fn and get_ext functions in this article make that easy.


The parse_fn function separates a fully qualified file name into parts: drive identifier, sub-directory
path and the file name proper. The input file name is in C string form and the file name parts are stored
in string form. Arguments passed to parse_fn specify memory areas where these parts are to be stored.
If a part is not present in the input file name, a null string is created for that part. The colon that ends
the drive identifier is copied with the identifier itself. Leading and trailing '\' chars are copied with the
sub-directory path name. This distinguishes the root directory path ("\\") from no path name (a null
string, "") and makes it easy to reassemble the parts again. For safety, maximum lengths of 9, 127 and
12 are allowed for the drive, path and file name parts; each destination area needs one more byte than
its limit for the terminating '\0'. (Device names such as LPT1: are treated as legal drive identifiers.)
For convenience, NULL pointers may be passed for parts of the file name that are not needed.
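
As an illustration of the splitting rule just described, here is a minimal sketch restated in ANSI C
with prototypes. The names split_name and copy_part are made up for this example and are not part
of the newsletter's source; the limits 9, 127 and 12 are the ones quoted above, and the caller is
assumed to supply buffers of at least 10, 128 and 13 bytes.

```c
#include <stddef.h>
#include <string.h>

/* Copy n chars of src to dst (skipped if dst is NULL) and terminate. */
/* Returns 1 if n is within maxn, 0 if the part is too long.          */
static int copy_part(char *dst, const char *src, size_t n, size_t maxn)
{
    if (n > maxn)
        return 0;
    if (dst != NULL) {
        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return 1;
}

/* Hypothetical restatement of parse_fn: 0 = success, -1 = invalid.   */
int split_name(const char *fn, char *drive, char *path, char *file)
{
    const char *p = fn;
    const char *px = strchr(p, ':');            /* first colon ends drive */
    size_t n = (px != NULL) ? (size_t)(px - p) + 1 : 0;

    if (!copy_part(drive, p, n, 9))
        return -1;
    p += n;                                     /* move past "d:" */

    px = strrchr(p, '\\');                      /* last '\' ends the path */
    n = (px != NULL) ? (size_t)(px - p) + 1 : 0;
    if (!copy_part(path, p, n, 127))
        return -1;
    p += n;                                     /* move past the path */

    return copy_part(file, p, strlen(p), 12) ? 0 : -1;
}
```

With this sketch, "C:\\BIN\\TEST.EXE" (written as a C literal) splits into "C:", "\\BIN\\" and
"TEST.EXE", while "\\X.C" yields a null drive, the root path "\\" and "X.C".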


The get_ext function separates the file name and extension into separate C strings. The '.' that
separates the two is placed in the extension string. Names that begin with a '.' are treated as having a
name and no extension; this takes care of the current directory (".") and parent directory ("..") special
cases.
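
The rule for names that begin with a period can be shown in a few lines. This is only a sketch:
ext_start is a hypothetical helper, not one of the newsletter's functions, and it returns a pointer
into the input string instead of copying the parts.

```c
#include <string.h>

/* Return a pointer to the extension (starting at its '.'), or to the */
/* terminating '\0' when the name has no extension.                   */
const char *ext_start(const char *fn)
{
    const char *p = strchr(fn, '.');    /* first period in the name */

    if (p == NULL || p == fn)           /* missing, or name starts with '.' */
        return fn + strlen(fn);         /* whole string is the name */
    return p;
}
```

For "NAME.EXT" the result points at ".EXT"; for "README", "." and ".." it points at the end of the
string, so the whole input is treated as the name.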


Both parse_fn and get_ext use C library functions to find separator characters. The strchr function
finds the first occurrence of a character in a string and strrchr finds the last occurrence. When a file
name part is found, the chk_cpy function checks its length, checks for a NULL destination pointer and
calls xstrncpy to copy the characters. The xstrncpy function in xstring.c uses the library function
memcpy to copy exactly n chars and then appends a '\0' at the end. (The strncpy library function does
not place a '\0' at the end if the maximum number of chars are copied.)
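
The difference from strncpy is easy to see in a small sketch. Here copy_n_terminate is a hypothetical
stand-in for xstrncpy, written with an ANSI prototype; the destination is assumed to have room for
n + 1 bytes.

```c
#include <string.h>

/* Copy exactly n chars and then terminate, as xstrncpy is described */
/* as doing. The destination must hold at least n + 1 bytes.         */
char *copy_n_terminate(char *to, const char *from, size_t n)
{
    memcpy(to, from, n);
    to[n] = '\0';
    return to;
}
```

Given a five byte buffer buf, strncpy(buf, "ABCDEF", 4) writes four chars and leaves buf[4]
untouched, so buf is not a valid C string. copy_n_terminate(buf, "ABCDEF", 4) leaves "ABCD" in buf.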


/* parse_fn.c - parse a fully qualified file name into parts */
/* Pass NULL ptrs if you don't need parts of the file name */
/* parse_fn - separates drive, path and file name ("name.ext") */
/* get_ext - separates name and extension parts of a DOS file name */
#include "stdio.h"
#include "string.h"
#include "xstring.h"

#define INVALID -1
                                /* max lengths for file name parts */
#define MAX_DRIVE 9             /* allows "device:" */
#define MAX_PATH 127
#define MAX_FILE 12
#define MAX_NAME 8
#define MAX_EXT 4               /* includes '.' */

int parse_fn(fn,drive,path,file) /* separate file name parts */
char *fn ;                      /* file name "d:path\\name.ext" */
char *drive ;                   /* put "d:" here */
char *path ;                    /* put "path\\" here with last '\\' */
char *file ;                    /* put "name.ext" here */
{
    char *p ;                   /* points to part of string not yet */
                                /* pulled apart */
    char *px ;                  /* value returned by strchr() */
    int n ;                     /* no. chars to be copied */

    p = fn ;
    px = strchr(p,':') ;        /* look for colon */
    if( px != NULL )
        n = px - p + 1 ;        /* found - copy thru ':' */
    else n = 0 ;                /* no ':' - copy null string */
    if( ! chk_cpy(drive,p,n,MAX_DRIVE) ) /* check length and copy */
        return( INVALID ) ;
    p = p + n ;                 /* move past drive and ':' */

    px = strrchr(p,'\\') ;      /* look for end of path name */
    if( px != NULL )
        n = px - p + 1 ;        /* found - copy thru last '\\' */
    else n = 0 ;                /* no '\\' - copy null string */
    if( ! chk_cpy(path,p,n,MAX_PATH) )
        return( INVALID ) ;
    p = p + n ;                 /* move past path name and last '\\' */
                                /* check file name length & copy it */
    if( ! chk_cpy(file,p,strlen(p),MAX_FILE) ) /* copy name proper */
        return( INVALID ) ;
    return( 0 ) ;               /* ??? make same convention as chk_cpy ?? */
}

/* get_ext - separate file name and extension */
/* assumes that the drive and path have already been removed */

int get_ext(fn,name,ext)        /* separate file name and extension */
char *fn ;                      /* file name string = "name.ext" */
char name[] ;                   /* put name part here (MAX_NAME+1 bytes) */
char ext[] ;                    /* put ext. part here (MAX_EXT+1 bytes) */
{                               /* returns 0=success, -1=invalid */
    int n ;
    char *p ;

    p = strchr(fn,'.') ;        /* find first period */
                                /* at start of string or missing ? */
    if( (p == fn) || (p == NULL) )
      { n = strlen(fn) ;        /* yes - whole string is name */
        if( ext != NULL )
            ext[0] = '\0' ;     /* make null extension */
      }
    else
      { n = p - fn ;            /* no. chars in name part */
                                /* start at '.' and copy extension */
        if( ! chk_cpy(ext,p,strlen(p),MAX_EXT) )
            return( INVALID ) ;
      }
    if( ! chk_cpy(name,fn,n,MAX_NAME) ) /* copy name part */
        return( INVALID ) ;
    return( 0 ) ;               /* return success */
}


int chk_cpy(to,from,n,maxn)     /* check length and copy */
char *to ;                      /* dest. area - put string here */
char *from ;                    /* source area */
int n ;                         /* number of chars to copy */
int maxn ;                      /* check n against this limit */
{                               /* returns 1 if OK, 0 if too long */
    if( n > maxn )
        return( 0 ) ;
    if( to != NULL )
        xstrncpy(to,from,n) ;   /* copy chars and put '\0' on end */
    return( 1 ) ;
}


/* xstring.h - string functions to supplement library */
char *xstrncpy() ;
                                /* function prototype */
/* char *xstrncpy(char *,char *,int n) ; */


/* xstring.c - string functions to supplement library */
#include "string.h"
                                /* use "memory.h" for MSC 3.0 */

char *xstrncpy(to,from,n)       /* copy chars and add end of string */
char *to ;                      /* destination area */
char *from ;                    /* get chars here */
int n ;                         /* number of chars to copy */
{
    memcpy(to,from,n) ;
    to[n] = '\0' ;
    return( to ) ;
}


Functions with a Variable Number of Arguments 


Although most C functions expect the same number of arguments each time they are called, there is a
need for functions that accept a variable number of arguments. The printf and scanf library functions
are an example; console input and output would be unnecessarily tedious if each scanf or printf call
could only handle a single value or if there were a different function for each data type. Most high level
languages provide I/O functions that handle a number of arguments of any data type, but they use a
special syntax reserved for such built-in functions. Since C uses the same syntax for all functions, you
can do anything that library functions such as printf and scanf do. The string and array concatenation
functions in this article illustrate practical uses for functions accepting variable numbers of arguments
and the techniques for implementing them.


The concat function below concatenates any number of strings, placing them in a destination area we
specify. After the arguments specifying the address and maximum size of the destination area, concat
expects one or more addresses of source strings to be concatenated. A NULL pointer value signals the
end of the list of strings.

Only the first source string is declared in concat. Initially, the parg variable points to that first source
string address. After a string is copied, parg is incremented to point to the next argument. The loop
ends when parg points to a NULL pointer. The arguments are addresses of C strings (char * type) and
parg is a pointer to them (char **).


Handling a list of arguments of different types is a little more complicated. The catchr function
concatenates arrays of characters that have no '\0' characters to mark the end of the arrays. For each
array to be added to the destination, both an address and a length are passed as arguments. The parg
variable again points to the address of the array being copied and the pn variable points to the
corresponding length argument. Since parg and pn point to different data types, the casts (int *) and
(char **) ensure correct type conversion. catchr returns the length of the destination array.


Although C allows functions to accept a variable number of arguments, it doesn't provide a portable way 
to implement this feature. In the concat function we must assume that parameters are stored 
contiguously in memory with the first parameter at the lowest address and the rest following at 
successively higher addresses. In the catchr function, we also assume that pointer and integer 
arguments are stored contiguously without padding between. These assumptions are valid for Lattice, 
Microsoft and most other MS-DOS C compilers but they might not apply for other environments. 
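
Compilers that follow the ANSI standard supply a header, stdarg.h, that hides these layout
assumptions behind the va_start, va_arg and va_end macros. The sketch below assumes your compiler
provides that header; concat_va is a hypothetical name, and the behavior (tomax excludes the
terminating '\0', the list ends with a null pointer) is modeled on the description of concat above
and is not the newsletter's own code.

```c
#include <stdarg.h>
#include <stddef.h>
#include <string.h>

/* Concatenate strings into to, which must hold tomax + 1 bytes.     */
/* The list of source strings ends with (char *)NULL.                */
/* Returns to on success, NULL if the result would not fit.          */
char *concat_va(char *to, int tomax, const char *from1, ...)
{
    va_list ap;
    const char *src = from1;
    char *pto = to;

    if (tomax < 0)
        return NULL;
    *pto = '\0';                        /* result is "" if list is empty */
    va_start(ap, from1);
    while (src != NULL) {
        size_t n = strlen(src);
        if (n > (size_t)tomax) {        /* no room for this string */
            va_end(ap);
            return NULL;
        }
        memcpy(pto, src, n + 1);        /* copy chars and the '\0' */
        pto += n;
        tomax -= (int)n;
        src = va_arg(ap, char *);       /* fetch the next argument */
    }
    va_end(ap);
    return to;
}
```

A call such as concat_va(buf, 20, "C:", "\\DIR\\", "help", (char *)NULL) leaves "C:\\DIR\\help"
(as a C literal) in buf. Because each argument is fetched by type with va_arg, nothing in the function
depends on how the compiler lays out parameters in memory.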


Sometimes an array can be used to pass a variable number of arguments. The concatv function below
accomplishes the same result as concat but in a more portable way. The disadvantage is that the array
must be declared and values inserted before each call to concatv. The execv-style library functions in
recent versions of Lattice and Microsoft C take an array of arguments, while the execl-style functions
accept a variable number of arguments directly.


Techniques for accepting a variable number of arguments can be more dangerous than normal C code. 
You should test such functions with the small memory model before using them with any memory model 
with 32 bit pointers. 
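
One specific danger is the terminator. On compilers that define NULL as a plain 0, a bare NULL in
the argument list is passed as an int, while the called function reads a pointer; with 32 bit
pointers the two differ in size. Casting the terminator to (char *) avoids the mismatch. The sketch
below shows this for a catchr-like function and assumes a compiler that supplies stdarg.h;
catchr_va is a hypothetical name, not the newsletter's function.

```c
#include <stdarg.h>
#include <stddef.h>
#include <string.h>

/* Concatenate arrays of chars. Arguments after tomax come in        */
/* (char *array, int length) pairs; the list ends with (char *)NULL. */
/* Returns the number of chars copied, or -1 if they do not fit.     */
int catchr_va(char *to, int tomax, ...)
{
    va_list ap;
    char *src;
    int n, ncopy = 0;

    va_start(ap, tomax);
    while ((src = va_arg(ap, char *)) != NULL) {
        n = va_arg(ap, int);            /* length follows each pointer */
        if (n < 0 || n > tomax - ncopy) {
            va_end(ap);
            return -1;                  /* no room - return failure */
        }
        memcpy(to + ncopy, src, (size_t)n);
        ncopy += n;
    }
    va_end(ap);
    return ncopy;
}
```

A typical call is catchr_va(buf, 8, "abc", 3, "de", 2, (char *)NULL), which copies five chars into buf
and returns 5. No '\0' is added, matching the description of catchr.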


/* concat.c - concatenate strings - any number */
#include "stdio.h"
#include "string.h"

char *concat(to,tomax,from1)    /* concatenate strings */
char *to ;                      /* put the result here */
int tomax ;                     /* maximum length allowed */
char *from1 ;                   /* first string to concatenate */
                                /* follow by ptrs to other strings */
                                /* and terminate with a null ptr */
{                               /* returns - to=success, NULL=failure */
    char **parg ;               /* point to arguments passed */
    char *pto ;                 /* copy next string here */
    int n ;                     /* length of next string */

    parg = & from1 ;            /* point to 1st from string pointer */
    pto = to ;
    while( *parg != NULL )      /* stop when NULL ptr found */
      { n = strlen(*parg) ;
        tomax = tomax - n ;
        if( tomax < 0 )         /* room for next string ? */
            return( NULL ) ;    /* no - return failure */
        strcpy(pto,*parg) ;     /* yes - copy it */
        pto = pto + n ;         /* & advance dest. pointer */
        parg++ ;                /* point to next arg */
      }
    return(to) ;
}


int catchr(to,tomax,from1,n1)   /* concatenate arrays of chars */
char *to ;                      /* put the result here */
int tomax ;                     /* maximum chars allowed */
char *from1 ;                   /* first array to concatenate */
int n1 ;                        /* size of first array */
                                /* follow by ptrs and sizes for other arrays */
                                /* and terminate with a null ptr */
{                               /* returns - no. chars=success, -1=failure */
    char **parg ;               /* point to array argument */
    int *pn ;                   /* point to size argument */
    char *pto ;                 /* copy next array here */
    int ncopy ;

    parg = & from1 ;            /* point to 1st array */
    pto = to ;
    ncopy = 0 ;
    while( *parg != NULL )
      { pn = (int *) (parg + 1) ;
        tomax = tomax - *pn ;
        if( tomax < 0 )         /* is there room for it ? */
            return( -1 ) ;      /* no - return failure */
        memcpy(pto,*parg,*pn) ; /* yes - copy chars */
        ncopy += *pn ;          /* add to char count */
        pto = pto + *pn ;       /* advance dest. pointer */
        parg = (char **) (pn + 1) ; /* point to next array */
      }
    return(ncopy) ;
}


/* concatv.c - concatenate strings with array of ptrs as arg. */
#include "stdio.h"
#include "string.h"

typedef char *pchar ;           /* pointer to char type */

char *concatv(to,tomax,fromv)   /* concatenate strings */
char *to ;                      /* put the result here */
int tomax ;                     /* maximum length allowed */
pchar fromv[] ;                 /* array of pointers to strings */
                                /* terminate array with a null ptr */
{                               /* returns - to=success, NULL=failure */
    char **parg ;               /* point to arguments passed */
    char *pto ;                 /* copy next string here */
    int n ;                     /* length of next string */

    parg = fromv ;              /* point to 1st string pointer */
    pto = to ;
    while( *parg != NULL )      /* stop when NULL ptr found */
      { n = strlen(*parg) ;
        tomax = tomax - n ;
        if( tomax < 0 )         /* room for next string ? */
            return( NULL ) ;    /* no - return failure */
        strcpy(pto,*parg) ;     /* yes - copy it */
        pto = pto + n ;         /* & advance dest. pointer */
        parg++ ;                /* point to next arg */
      }
    return(to) ;
}


String Functions and Header Files 


All the C source code in this newsletter deals with C character strings and uses string-oriented library 
functions. The functions used are defined in the ANSI standard and provided in recent versions of Lattice 
and Microsoft C. If your compiler does not provide the header file string.h to declare these functions, 
create the following string.h file: 


/* string.h - declare string functions */
/* use with compiler versions 1.xx or 2.xx */
char *strcpy() ;
char *strncpy() ;
char *strcat() ;
char *strncat() ;
int strcmp() ;
int strncmp() ;
char *strchr() ;
int strcspn() ;
char *strpbrk() ;
char *strrchr() ;
int strspn() ;
int strlen() ;
/* define memxxx in terms of Lattice specific functions */
/* (no space between the macro name and its parameter list) */
#define memcpy(to,from,n) (movmem(from,to,n),to)
#define memset(to,c,n) (setmem(to,n,c),to)
#define memmove(to,from,n) (movmem(from,to,n),to)


Finding Auxiliary Files 


You often need to access auxiliary files such as help files, overlays and configuration files from C
programs. In a simple PC configuration with two floppy disks you can assume they are on drive A:. But
on a PC with two hard disks and many sub-directories, finding auxiliary files can be a problem. One
approach is to require that such files be in a fixed location. (For example, \lc for Lattice C or \include
for Microsoft C header files.) But since many PCs have network connections or more than one logical
hard disk drive, this isn't an adequate solution for software that must run on many PCs. There isn't a
universal solution but the testarg.c source file illustrates some effective solutions for the MS-DOS
environment.

Placing an .EXE file and all its auxiliary files in a single sub-directory is a common installation
procedure. Starting with Version 3.1 of MS-DOS, command.com passes the program name to programs
along with command line arguments. Lattice C and Microsoft C (Versions 3.00 and later) set argv[0] to
point to the actual program name. Testarg.c uses the parse_fn and concat functions developed
elsewhere in this issue to construct a help file name in the same directory as the program itself. For
older versions of DOS or the C compiler, argv[0] points to a dummy name so this technique is not
useful for all PCs.
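
The idea of replacing the program's own file name with the name of an auxiliary file can be sketched
in a single function. This is an illustration only: sibling_name is a hypothetical helper that does the
work of the parse_fn and concat calls in one step, and "help" is just the example file name used in
testarg.c.

```c
#include <stddef.h>
#include <string.h>

/* Replace the file name part of a program path with fname, leaving  */
/* the drive and directory intact.                                   */
/* Returns out, or NULL if the result would not fit in outmax bytes. */
char *sibling_name(char *out, size_t outmax, const char *prog,
                   const char *fname)
{
    const char *sep = strrchr(prog, '\\');  /* last path separator */
    size_t n;

    if (sep == NULL)
        sep = strrchr(prog, ':');           /* "C:PROG.EXE" - drive only */
    n = (sep != NULL) ? (size_t)(sep - prog) + 1 : 0;

    if (n + strlen(fname) >= outmax)
        return NULL;
    memcpy(out, prog, n);                   /* drive and path, if any */
    strcpy(out + n, fname);                 /* new file name and '\0' */
    return out;
}
```

Called with argv[0] and "help", it produces a name in the program's own directory. When argv[0]
holds only a dummy name with no drive or path, the result is simply "help" in the current directory.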


DOS also passes a pointer to the environment area to programs it executes. This area contains a series of 
C strings of the form NAME=VALUE, with a null string terminating the list. The DOS PATH command 
sets a list of sub-directory paths to be searched for a program to be executed. Since the PATH list is 
stored in the environment area, C programs can also use it to find auxiliary files. The path list may contain 
several directory names separated by semi-colons. Testarg.c uses the scanpath function to extract a 
single directory path for each call. You must check for auxiliary files in each directory until you find them. 
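
Checking each directory in turn comes down to two steps: take the next name from the list, then
join it to the file name you are looking for. The sketch below shows both steps in ANSI C. The
names next_dir and join_name are hypothetical, and an empty entry is treated as the end of the
list, as in the testarg.c loop.

```c
#include <stddef.h>
#include <string.h>

/* Copy the next ';'-separated directory from *list into dir and     */
/* advance *list past it. Returns 1 if a name was copied, 0 at the   */
/* end of the list or if the name does not fit in dirmax bytes.      */
int next_dir(const char **list, char *dir, size_t dirmax)
{
    const char *start = *list;
    const char *end = strchr(start, ';');
    size_t n = (end != NULL) ? (size_t)(end - start) : strlen(start);

    if (n == 0 || n >= dirmax)
        return 0;
    memcpy(dir, start, n);
    dir[n] = '\0';
    *list = (end != NULL) ? end + 1 : start + n;
    return 1;
}

/* Build "dir\fname" in out, adding a '\' only when dir lacks one.   */
/* Returns 1 on success, 0 if the result would not fit.              */
int join_name(char *out, size_t outmax, const char *dir, const char *fname)
{
    size_t n = strlen(dir);
    size_t sep = (n > 0 && dir[n - 1] != '\\' && dir[n - 1] != ':') ? 1 : 0;

    if (n + sep + strlen(fname) >= outmax)
        return 0;
    strcpy(out, dir);
    if (sep)
        strcat(out, "\\");
    strcat(out, fname);
    return 1;
}
```

In a program, the list would come from getenv("PATH"), and each joined name would be tried with
fopen until one call succeeds.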


You can define your own environment variable using the DOS SET command. For example, recent
versions of Lattice and Microsoft C can use the INCLUDE environment variable to locate files named in
#include statements. The final section of testarg.c prompts for an environment variable name and uses
getenv to find its value. A sample output from testarg follows the listings.


Older versions of Lattice and Microsoft C do not provide the getenv function. An equivalent is provided
on supplementary disk #9.


/* testarg.c - test argv[0] */
#include "stdio.h"
#include "stdlib.h"

char help_file[201] ;

main(argc,argv)
int argc ;
char *argv[] ;
{
    int i ;
    char drive[20] , path[128] ;
    char s[30] ;
    char *p ;

    for(i=0;i<argc;i++)
      { printf(" arg %2d = <%s> \n",i,argv[i]) ; }
    parse_fn(argv[0],drive,path,NULL) ;
    printf(" drive=<%s> \n path=<%s> \n",drive,path) ;
    concat(help_file,200,drive,path,"help",NULL) ;
    printf(" help file name = <%s> \n",help_file) ;

    p = getenv("PATH") ;        /* get pointer to path variable */
    printf("\n\n PATH VARIABLE = <%s> \n",p ) ;
    while( p != NULL )
      { scanpath(&p,path) ;     /* get next name from path var. */
        if( *path == '\0' )     /* quit when null string returned */
            break ;
        printf(" <%s> \n",path) ;
      }
    printf("\n environment variable (UPPER CASE) :");
    gets(s) ;
    printf(" %s=<%s> \n",s,getenv(s) ) ;
}

/* scanpath.c - get one path name from path=list */
/* assume list is like "name;name;...;name" */
#include "stdio.h"
#include "string.h"

int scanpath(start,name)
char **start ;                  /* points to starting address */
char *name ;                    /* put next name here */
{
    char *end ;

    end = strchr(*start,';') ;
    if( end == NULL )           /* no ; - entire string is one name */
      { strcpy(name,*start) ;
        *start = *start + strlen(*start) ; /* next start at \0 */
      }
    else
      { xstrncpy(name,*start,end-*start) ; /* copy up to the ';' */
        *start = end + 1 ;      /* next time start after ';' */
      }
}


Testarg - Sample Output


C>SET
COMSPEC=C:\COMMAND.COM
PATH=D:\LC;C:\;C:\BOOK.DIR\PROGRAMS.DIR;C:\WP.DIR
BPATH=d:\brief\macros

C>testarg
 arg  0 = <C:\NEWSLETT.DIR\ISSUE9.DIR\TESTARG.EXE>
 drive=<C:>
 path=<\NEWSLETT.DIR\ISSUE9.DIR\>
 help file name = <C:\NEWSLETT.DIR\ISSUE9.DIR\help>


 PATH VARIABLE = <D:\LC;C:\;C:\BOOK.DIR\PROGRAMS.DIR;C:\WP.DIR>
 <D:\LC>
 <C:\>
 <C:\BOOK.DIR\PROGRAMS.DIR>
 <C:\WP.DIR>

 environment variable (UPPER CASE) :BPATH
 BPATH=<d:\brief\macros>


A Message for IBM and Microsoft: Get Off Your Butts 


When IBM introduced the PC, they provided real value to the personal computer market. The PC's 
standard architecture and large installed base created a market for ready-to-use software distributed in a 
single format. The PC's cost effectiveness stimulated new applications and the market stimulated new 
developments in printers and hard disks. PC users, software developers, Microsoft and IBM have all 
benefitted from the IBM PC. But the PC market may stagnate soon from lack of innovation. 


Much of the growth in the PC market is due to the continuing expansion of the PC's capabilities: more
RAM, hard disks, faster processors, and more operating system capabilities. IBM and Microsoft get
windfall profits from the PC market and in return they should be extending the PC and PC-DOS
architecture to make it more capable and cost effective. But for the past 18 months, neither IBM nor
Microsoft has done much to earn their profits and the PC architecture has stagnated. Here are some
things IBM and Microsoft should have done by now to support growth in the PC market:


1. Prepare for 286/386 protected mode - IBM or Microsoft will eventually release a version of PC-DOS
that provides access to more than 640K bytes of RAM through the protected mode of the 80286 and
80386 processors. To be immediately useful, that new DOS should run existing PC software. This
requires some advance planning by IBM and Microsoft: LINK program extensions to the .EXE file
structure and publicized guidelines for software developers and compiler writers. This should have been
done 12 to 18 months before a protected mode DOS was introduced so that most existing software
would be usable under the new DOS. (I am sure that IBM and Microsoft have been working with some
large developers, but they should have been quicker and more public about it.)


2. Put an 80286 or 80386 in every PC sold - If software developers are to make use of more than 640K
in new software, there must be a large installed base of 286 based machines. Because IBM has been
slow to introduce a low-cost 286 based PC replacement, 8088 and 8086 based PCs still make up most of
the installed base. IBM should discontinue 8088 based PC models and introduce a 286 based PC priced
under $1500.


3. Put good graphics capability in every PC sold - PC graphics applications need a large installed base
of graphics-capable PCs to be successful. IBM should replace the text only monochrome adapter with a
Hercules-compatible adapter. Its 720 x 350 black and white graphics are fine for many applications and
save $500-600 over a color system. IBM should also discontinue the CGA adapter, replacing it with a
low cost EGA adapter that runs all CGA software too.


4. Introduce an open 386 architecture machine - Much of the growth in PC use will come from 
applications that require lots of RAM memory and lots of processor power. The 386 processor is much 
better for such applications than is the 286 processor in the AT. An open-architecture 386 based PC is 
also IBM's best hope to steal CAD/CAM business from Sun and Apollo workstations. A good 386 
architecture will provide a 32 bit bus for high resolution graphics and other fast I/O devices. It should also 
include a hard disk controller that supports faster transfer rates than the current 5 megabits per second and 
a tighter interleave factor than the AT's 3:1 value. The 386 machine should allow for more than 16 
megabytes of RAM memory even if current chips make such large memory sizes uneconomic. 


5. Define high performance standardized interfaces for graphics - The lack of a standard graphics
interface for CRTs, laser printers and plotters has slowed development of IBM PC based graphics
applications. The Virtual Device Interface, IBM's first effort, was too slow to be useful in most products.
Microsoft's Windows is too large, too slow and too late to be useful with the installed base of
8088/8086 PCs.


6. Provide real-time support in PC-DOS - Many new real-time applications for PCs need low overhead,
multi-tasking support from the operating system. ATs are very cost effective for multi-channel data
communications, voice response, laboratory data collection and many other real-time applications.
These applications need DOS support for preemptive scheduling of multiple tasks, concurrent I/O and
processing, and better real-time clock support. Interactive multi-tasking products such as Windows,
DoubleDOS and DESQview have far too much overhead to be useful for real-time applications.


Faced with increasing competition from generic clones and its own sagging market share, IBM may feel
that it should make the PC architecture more proprietary and close out the competition. It must be
tempting to conclude that the PC's open architecture is the source of IBM's problems. The real lesson is
that no company, not even IBM, can retain leadership in the personal computer marketplace without
adding value to its products. It isn't too late but time is running out for IBM and for the PC marketplace.


Books on Using MS-DOS Facilities 


Information on using DOS services is often vital for producing quality programs that are bulletproofed
and have a polished user interface. Neither Microsoft nor IBM describes everything a DOS programmer
needs to know, nor do their manuals describe DOS services clearly enough. There are a number of books
available to help a DOS programmer; here are my selections for the sheep and the goats.


The Disk Operating System Technical Reference manual published by IBM (No. 6024213, $109) is the
Listerine of DOS books. I use it and I hate it. It is the primary public document defining the PC-DOS
standard. Other manufacturers of PC-compatible computers have published similar manuals for their
versions of MS-DOS and Microsoft has published a generic version. All versions are incomplete - no
information on bugs, side effects and the subtle implications of DOS calls. Recent versions provide
examples of making single DOS calls but are useless in understanding how to use several DOS calls to
accomplish something useful.


Advanced MS-DOS, Ray Duncan. Microsoft Press, 1986. This is the best reference available on making
DOS calls. The 64 page reference section on DOS calls includes good notes on side effects and
interactions between DOS calls. The Lotus/Microsoft/Intel expanded memory interface is documented
and examples of its use provided. BIOS calls are used in examples but not discussed thoroughly. There
are lots of short examples with program fragments and several longer examples with complete programs:
an interrupt driven terminal emulator program, a file dump program and a toy shell (command line
interpreter), for example. Discussions of exception handling and device drivers include good advice and
good examples; the section on debugging device drivers will greatly help in getting a device driver
working. There are flaws in this book: a Table of Contents without detail and a mediocre index make it
difficult to find topics covered in the book. DOS bugs are not mentioned and differences between DOS
versions are not cited adequately. But overall, this is a fine book with a good balance of reference
material, advice and examples.


Programmer's Guide to the IBM PC, Peter Norton. Microsoft Press, 1985. This book is mostly
regurgitation of the DOS Technical Reference and the PC hardware Technical Reference manuals. There
is little information about traps, bugs and side effects of DOS calls or about using several DOS calls to
accomplish something. Until Duncan's book appeared, this was the best reference for MS-DOS. Norton
covers using BIOS and PC hardware more thoroughly but Duncan's book is a better guide to MS-DOS
programming.


MS-DOS Developer's Guide, John Angermeyer and Kevin Jaeger. Howard Sams, 1986. This isn't a 
complete guide to using DOS; it is a mixed bag with some material on MS-DOS services. Chapter 1 with 
practical advice on using the MASM assembler and chapter 2 with assembler language coding techniques 
are excellent. Parameter passing via global variables, registers and the stack are all discussed as are 
position independent programs, reentrant and recursive code and the role of the 8088 segment registers in 
these techniques. A chapter on real-time programming offers good advice but no examples. Chapters on 
MS-DOS memory management and DOS device drivers mostly document those DOS services. A chapter 
on compatibility is poorly worded and obsolete. Chapters on LANs, high level languages and recovering 
data from RAM memory have little value. Out of 11 chapters, 2 are excellent, 5 are somewhat useful, 1 is 
questionable and 3 are worthless. Some sections of the book are poorly worded and the material in some 
chapters is poorly chosen but the good parts contain lots of practical insight from good programmers. 


Programmer's Guide to MS-DOS, Dennis Jump. Reston Publishing, 1984, and The IBM PC-DOS
Handbook, R.A. King. Sybex, 1983. Both books are pure regurgitation of the IBM technical reference
manuals. Neither author provides new information, insight or even good examples. Norton's book does
what these books do and does it much better.


MS-DOS Technical Reference Encyclopedia. Microsoft Press, 1986. This 1000 page, $135 monster is
the largest and worst book on the subject. Hundreds of pages are wasted on irrelevant descriptions of
every DOS command (including the TYPE command), every EDLIN command and every DEBUG
command. The book is poorly arranged and a poor table of contents and index make it difficult to find
anything. Large numbers of factual errors and terrible writing make the content less useful. Finally, the
book adds little to the DOS technical reference manual - no more information, no clearer presentation
and no more useful examples. Microsoft Press has recalled this book but some bookstores still have
copies on the shelf.


New Supplementary Disk 


A new supplementary disk (#9) is available with all source code from newsletter issues #1, 3, 4, 6, 7, 8
and 9. It also contains a getenv function for use with Version 2.xx of Lattice or Microsoft C. See the
back page of this issue for descriptions of source code in these issues.


Supplementary disk #5 is still available. It contains source code from issues #2 and #5 as well as support
for writing device drivers and memory resident programs in C. (These are Lattice specific for now.)


By the time issue #10 is published, I hope to have a disk of public domain C source code and a disk of
programmer-oriented shareware.


Time To Renew? 


If your mailing label says 9/86 this is your last issue. If it says 11/86, you have one more issue coming. 
Either way, it's time to renew your membership. 


Reviews of C Interpreters


This article reviews two C interpreters, C-Terp and Instant-C, for the IBM-PC environment. They are 
fairly expensive and intended to supplement a normal C compiler to improve programmer productivity. C 
compilers all work the same way, but C interpreters differ greatly in functionality and implementation. So 
first we look at what a C interpreter can contribute in each phase of the software development process. 


Getting a Clean Compile - An interpreter can speed up compiling a C source file. It may also speed
up correcting syntax errors as they are found. And a one or two keystroke transition from editing to
compiling (and back) allows the programmer to concentrate on his source code rather than DOS
commands. The table below compares compile times for C-Terp and Instant-C to those for Lattice and
Microsoft C.


Compile times (seconds)

               Lat. 3.1    MSC 4.0    Inst-C 2.0    C-Terp 2.1

50 lines          16          26           3            <3
150 lines         38          85          11            <3
500 lines         87         149          30            <3
single fun.       --          --          <3            --


Unit Testing Source Files - Testing each C source file before executing the entire application is the
key to producing bug-free quality programs. Writing test programs longer than the source file being
tested is tedious and time-consuming work. A tool that makes unit-testing quicker and more pleasant
makes thorough testing more likely to happen. Faster editing and compilation speed up producing test
programs. An interpreter that executes individual C functions without test programs is the best solution.
Source level debugging features - breakpoints, single step and C expression evaluation - are also
valuable.


Getting the Bugs Out - Good unit-testing makes testing the application as a whole easier and less
exciting, but there will be some bugs and it is necessary to verify correct operation. Good source level
debugging features are important in this phase. To be useful here, a C interpreter must work with
full-size programs composed of lots of C source files, requiring lots of RAM memory. The interpreter
must also execute C code fast enough to work through normal operation of the program.


Tracking Down Bugs in Released Programs - When bugs appear in released programs, they must 
be replicated and understood before they can be fixed. A debugger that works with a production .EXE file 
is more useful than an interpreter that doesn't. 


Producing a C interpreter that demonstrates the concept is relatively easy but it takes a lot of effort to 
complete the job and produce a tool that is useful to a working programmer. The following questions 
address the practical issues that separate the toys from the worthwhile tools. 


Is it safe? - C interpreters are used to execute programs with the bugs present. Does it ensure that DOS 
and its data structures are not corrupted? Is your C source code safe from being lost or corrupted? The PC 
environment provides little protection but good documentation can help you use an interpreter safely. 


What errors does it catch? - Catching errors such as bad pointer values, subscripts out of range, 
function argument or return value mismatches can speed up debugging C programs dramatically. At the 
least, the interpreter should do as well as current compilers using function prototypes. 
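
To make the subscript and pointer checks concrete, here is a minimal sketch in plain C of the kind of test a checking interpreter might apply before each array store. The name checked_store and its return convention are hypothetical, invented for illustration; neither interpreter reviewed here documents its checks in this form.

```c
#include <stddef.h>

/* Hypothetical helper: store value at a[index] only when the pointer
   is non-null and the subscript lies inside the array.
   Returns 1 on success, 0 when the store is rejected. */
int checked_store(int *a, size_t len, long index, int value)
{
    if (a == NULL)                           /* bad pointer value */
        return 0;
    if (index < 0 || (size_t)index >= len)   /* subscript out of range */
        return 0;
    a[index] = value;
    return 1;
}
```

A compiled program pays for such a check only where the programmer writes it; an interpreter can apply it to every store, which is where the debugging benefit comes from.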


Does it support the full C language? - If it doesn't support the full K&R C language, it isn't a 
serious tool. Function prototypes are a major improvement in C; a C interpreter that doesn't support them 
is out of date. Other ANSI extensions such as the void data type, enumerated types, and structure 
assignment and parameter passing are less important but should be supported soon. 
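
For readers who have not yet used these extensions, the short sketch below shows each one in a few lines: prototypes, the void type, an enumerated type, and a structure passed and returned by value. The type and function names (color, point, shift, reset) are made up for this illustration.

```c
enum color { RED, GREEN, BLUE };              /* enumerated type */

struct point { int x; int y; };

/* Function prototypes: argument and return types are declared up front,
   so a mismatched call can be caught at compile time. */
struct point shift(struct point p, int dx);   /* structure passed by value */
void reset(struct point *p);                  /* void: no return value */

struct point shift(struct point p, int dx)
{
    p.x += dx;            /* changes the local copy only */
    return p;             /* structure returned by value */
}

void reset(struct point *p)
{
    p->x = 0;
    p->y = 0;
}
```

With these definitions, `b = shift(a, 3);` leaves a unchanged, and `b = a;` copies the whole structure in one assignment. An interpreter that rejects enumerated types or structure assignment would refuse this file even though a current compiler accepts it.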


Does it fit with your compiler? - You still need the Lattice or Microsoft compiler to produce 
production .EXE programs - an interpreter's penalty in speed and RAM size is just too great for serious 
programs. You should be able to compile and link C source files with your normal compiler without 
change. The interpreter should provide all your compiler's library functions and the same .H header files. 
Support for .ASM modules, .LIB files and programs writing directly to the PC screen is important too. 


Is it a Toy? - Good C style uses lots of source files and header files. An interpreter must work with 
many source files without slowing down or running short of memory. It should provide a way to load a 
list of source files automatically. Serious programs may require lots of RAM for code and data; a limit of 
64K bytes in the interpreter makes it less useful. An interpreter also takes up part of the PC's 640K of 
RAM - a problem for large programs. Execution speed need not be as fast as that of a compiler but very 
slow execution may limit its usefulness. 


Will it waste my time or save it? - An interpreter will not save time if its operation is clumsy or its 
documentation provides little help. Since C interpreters are not standardized in features or operation, that 
documentation is your main resource. 


C-Terp - A Simple but Effective Interpreter 


The C-Terp interpreter provides an editor, compiler, linker and debugger in a single program requiring 
140-190 Kbytes of RAM memory. From an overall menu, a single keystroke selects those functions or 
execution of your C program. This single keystroke interface is used throughout the program with single 
line menus. Where file names are required, the file last used is the default choice. C-Terp has the usual 
edit, compile, link and execute cycle, but it keeps files in RAM for much faster results than a compiler. 


On a PC-XT, compile times are under three seconds for source files of up to 500 lines. Moving between 
editor, compiler and debugger requires only one or two keystrokes. When the compiler detects a syntax 
error, it displays an error message; pressing a key enters the editor with the cursor positioned at the error. 


C-Terp frequently displays unnecessary messages such as "Press a key to enter the editor" and requires an 
extra keystroke. The full-screen editor is adequate but block operations and commands for joining and 
splitting lines are clumsy. But C-Terp's simplicity and speed make these minor complaints. For entering 
C source files and getting clean compiles, C-Terp is the best development tool I have seen. 


For testing individual source files, C-Terp's speed is useful but it does require you to construct whole test 
programs. Linking is performed when you run the program; it is instantaneous for small programs. The 
C-Terp debugger provides single stepping, executing C expressions and execution tracing at C source level. 
A full-screen display of source files makes it easy to debug programs without printed listings. The 
debugger provides access to global variables and local variables (only for the function interrupted). No 
features are provided for debugging .ASM modules. A single keystroke enters the editor with the cursor 
positioned as in the debugger display. 


C-Terp has some limitations for testing entire programs. The interpreter requires 140-190K bytes of RAM 
but overflowing source files and symbol tables can be moved to a disk file. The large memory model 
supports applications with large amounts of data. Execution speed is 0.05 to 0.25 as fast as Lattice C - 
adequate for testing individual source files but not fast enough for many whole applications. Executing a 
simple loop that copied 2000 integers took many seconds. 
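
The loop used for that measurement is not reproduced in this review. The sketch below is an assumed reconstruction of that kind of test, with hypothetical names (copy_ints, time_copy), using the standard clock function to time repeated copies of 2000 integers.

```c
#include <time.h>

#define N 2000

/* Copy n integers from src to dst, one element at a time. */
void copy_ints(int *dst, const int *src, int n)
{
    int i;

    for (i = 0; i < n; i++)
        dst[i] = src[i];
}

/* Run the copy `reps` times and return the elapsed CPU time in seconds. */
double time_copy(int reps)
{
    static int src[N], dst[N];
    clock_t start, stop;
    int r;

    for (r = 0; r < N; r++)      /* fill the source with known values */
        src[r] = r;
    start = clock();
    for (r = 0; r < reps; r++)
        copy_ints(dst, src, N);
    stop = clock();
    return (double)(stop - start) / CLOCKS_PER_SEC;
}
```

Running the same source under the interpreter and as a compiled program gives a rough speed ratio. The repetition count matters because a single pass may be too short for the clock to resolve.
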
C-Terp checks assignments using pointers and subscripts and checks addresses passed to library 
functions. This provides some security but it isn't bulletproof with C-Terp's large memory model. The 
manual's discussion of pointer checking and other safety issues is incomplete and poorly organized. Since 
C-Terp keeps C source files in RAM, the loss or corruption of those files is also a concern. I produced 
two crashes while executing C programs in C-Terp; in both cases C source files were lost. C-Terp detects 
mismatches in function return values and assignments to invalid memory areas but bad pointer values and 
subscripts are detected only in assignments outside valid static and local memory areas. Numbers and 
types of function arguments are not checked at all. 


Full K&R C and the void type are supported, but enumerated types and structure assignment are treated as 
errors. Function prototypes are ignored but not flagged as errors. C-Terp provides a minimal C library 
but a batch file adds most Lattice or Microsoft library functions. A similar procedure links C functions 
compiled with Lattice or Microsoft C, or .ASM modules, into C-Terp; since it takes 5 minutes and creates a 
new copy of C-Terp, it is mostly useful for making permanent additions to C-Terp's library. C-Terp 
works well with programs that write directly to the PC screen after you discover how to tell C-Terp that 
the screen memory area is a valid area for assignments. 


C-Terp is well implemented but there are some bugs. Functions that accept a variable number of 
arguments are not handled correctly. When the debugger displays a (char *) value, it also displays the C 
string at that address. Since the value may not point to a C string, this can display thousands of chars at a 
rate of two chars per second before a '\0' character is found. Pressing Control-Break twice to stop this 
display crashes C-Terp. Tabs in the command line are not recognized as white space in setting up the 
argv[] array. An #include file search path that specifies the current directory (C: for example.) 
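
The tab problem is easy to picture. The sketch below shows one way a command line can be split into argv[] entries so that tabs count as separators along with spaces. It is an illustration only, not C-Terp's code; split_args is a hypothetical name, and quoting is deliberately ignored.

```c
#include <ctype.h>

/* Hypothetical helper: split `line` in place into at most `max`
   arguments, treating any white space (spaces, tabs) as a separator.
   Returns the number of pointers stored in argv[]. */
int split_args(char *line, char *argv[], int max)
{
    int argc = 0;
    char *p = line;

    while (*p != '\0' && argc < max) {
        while (isspace((unsigned char)*p))    /* skip separators */
            p++;
        if (*p == '\0')
            break;
        argv[argc++] = p;                     /* start of an argument */
        while (*p != '\0' && !isspace((unsigned char)*p))
            p++;
        if (*p != '\0')
            *p++ = '\0';                      /* terminate the argument */
    }
    return argc;
}
```

Testing for a space character alone, instead of calling isspace, would produce the behavior described above: an argument containing a tab would arrive as one string.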


Basic operation of C-Terp is so easy that you don't often need the manual. That is a good thing since the 
manual is quite inadequate. Important topics such as differences between C-Terp and a compiler, memory 
model, safety and error detection are poorly discussed. The manual is poorly organized with section titles 
unrelated to the practical use of the features. 


Instant-C - Powerful but Awkward 


Instant-C is very similar in operation to APL and LOGO interpreters. C statements and expressions 
entered from the keyboard are executed immediately; this provides a mechanism for executing functions 
and examining and setting variables. Individual C functions can be executed directly - no main function is 
necessary. Instant-C compiles and links C source code automatically without explicit commands. 


Instant-C's editor operates on single C functions or variables or whole source files. When you leave the 
editor, this code is automatically compiled. Within Instant-C, C code is grouped into memory files that 
correspond to APL workspaces. At the end of a session, memory files can be saved to .C source files. 
Existing C source files can be loaded into a memory file; they are compiled as they are loaded. 


A full-screen source level debugger provides single-stepping, tracing execution and examining and setting 
memory areas. Since C expressions can be executed immediately, you can examine and set C variables 
without remembering Instant-C debugging commands. 


Instant-C's full-screen editor is fast and has satisfactory features for editing C source files. When you 
leave the editor, automatic compilation is quick compared to Lattice or Microsoft C. When you edit a small 
function rather than an entire source file, compile times are under 3 seconds. When the compiler detects a 
syntax error, the editor is re-entered and the cursor placed at the line where the error was detected. This 
makes finding and correcting most syntax errors fast. But Instant-C does not allow you to save changes to 
a memory file if syntax errors are present. This often requires you to save changes to a scratch file and 
retrieve them later. Omissions or misspellings in .H files that cause syntax errors in .C files can't be 
corrected simply in Instant-C. Although Instant-C is faster than a conventional editor and compiler, it is 
not a pleasant tool for entering C source code and removing syntax errors. 


Instant-C is much better as a tool for unit-testing C source files. Since it can execute C functions directly, 
you can avoid writing extensive test programs. The debugger's full-screen display of C source code 
makes setting breakpoints and stepping through a function easy. C program execution can be stopped 
with the Control-Break key even if the program is in a loop. 


Instant-C takes up over 200K bytes of RAM and limits data space to 64K bytes. This will limit its usefulness 
for testing whole applications. Instant-C does not store source code in RAM - memory files contain 
compiled code and data. Instant-C's execution speed (0.5 of Lattice's) will rarely be a problem. 


Instant-C protects against misbehaving C functions well. The small memory model limits damage when 
bad addresses are passed to .OBJ modules. I did manage to crash Instant-C when I tried to load a source 
file with syntax errors and then quit the editor without saving that source file. The manual provides no 
information about Instant-C's protection against program errors. 


Full K&R C and the void type are supported but enumerated types and structure assignments are not 
supported yet. Function prototype declarations are flagged as syntax errors; this is awkward if you want 
to use prototypes with Lattice or Microsoft C. Functions called with a variable number of arguments are 
treated as errors - a deviation from standard C. As it is delivered, Instant-C's library and header files are 
not a good fit with Lattice or Microsoft C. PC-specific functions for making DOS calls and general 
software interrupts are included but names don't match Lattice or Microsoft C. A procedure for adding 
.OBJ and .LIB files to Instant-C is poorly documented; I had trouble getting it to work. 
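
The restriction on variable arguments matters for functions in the style of printf. As a minimal sketch of what is affected, the function below sums a variable number of int arguments. It uses the ANSI <stdarg.h> macros, which is an assumption on my part: older compilers may supply a different mechanism, and sum_ints is a hypothetical name.

```c
#include <stdarg.h>

/* Sum the `count` int arguments that follow the count itself. */
int sum_ints(int count, ...)
{
    va_list ap;
    int total = 0;
    int i;

    va_start(ap, count);             /* begin after the last named argument */
    for (i = 0; i < count; i++)
        total += va_arg(ap, int);    /* fetch the next int argument */
    va_end(ap);
    return total;
}
```

A call such as `sum_ints(3, 10, 20, 30)` passes four arguments to a function declared with one named parameter, which is exactly the pattern an argument-count check would flag unless variable arguments are specifically allowed for.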


Instant-C works well when many source files are required. The debugger switches easily from one source 
file to another. Redirecting Instant-C input to a file of load commands provides a way to load a number of 
source files automatically. Programs that write directly to the screen can be executed under Instant-C. 


Documentation is the real failing in Instant-C. The manual has over 300 pages of reference material and is 
readable and neat. But Instant-C is a complicated program with over 50 commands and a number of 
awkward limitations. The concepts underlying Instant-C's operation are not explained nor are enough 
practical examples provided. You are on your own to discover how Instant-C works and how to work 
around its quirks. Rational Systems provides good support but adequate documentation is still needed. 


Conclusions 


C-Terp works like a conventional editor, compiler and source debugger. But it is much faster for editing 
and compiling and much slower for debugging. Its design is simple but its speedy operation makes it very 
pleasant to use. C-Terp speeds up the process of entering source code and getting clean compiles. It is 
also useful for unit-testing although its large memory model is not entirely safe. At $300 list, it's worth a 
good look. (With decent documentation, it would get a rave review.) 


Instant-C is a complex product that needs a bit more work. Awkward editing features make it a poor 
substitute for a normal editor and compiler. But Instant-C is the best tool available for testing individual 
source files. Ineffective documentation and lots of commands make learning and using Instant-C a lot of 
work. Instant-C is worth the $495 list price but not the time and effort required. 